---
id: weaviate
title: Weaviate
sidebar_label: Weaviate
---

## Quick Summary

**Weaviate** is a cloud-native, open-source vector database that uses state-of-the-art ML models to embed data. It is fast, flexible, and designed for production-readiness, capable of performing 10-NN nearest neighbor searches on millions of objects in milliseconds.

:::tip
To learn more about leveraging Weaviate as your retrieval engine, [visit this page](https://weaviate.io/).
:::

<div
  style={{
    display: "flex",
    alignItems: "center",
    justifyContent: "center",
    flexDirection: "column",
  }}
>
  <img
    src="https://weaviate.io/assets/images/rag-base-00430efd1764c948a95b4d1bbd1e19a9.png"
    style={{
      marginTop: "10px",
      height: "auto",
      maxHeight: "800px",
    }}
  />
  <div
    style={{
      fontSize: "13px",
      marginTop: "10px",
      marginBottom: "30px",
    }}
  >
    RAG pipeline with Weaviate retrieval engine (source: Weaviate)
  </div>
</div>

Youn can easily evaluate your **Weaviate** retriever with DeepEval to find the best hyperparameters for your Weaviate engine. This parameters include `with_limit` (top-K) and `vectorizer` (embedding model), among many others.

:::info
You can quickly get started with Weaviate by running the following command in your CLI:

```
pip install weaviate-client
```

:::

## Setup Weaviate

To start using Weaviate, establish a connection to your local or cloud-hosted instance by initializing a Weaviate client and configuring authentication with your API key.

```python
import weaviate
import os

client = weaviate.Client(
    url="http://localhost:8080",  # Change this if using Weaviate Cloud
    auth_client_secret=weaviate.AuthApiKey(os.getenv("WEAVIATE_API_KEY"))  # Set your API key
)
```

To enable efficient similarity search, define a **Weaviate schema** that stores documents with a `text` property for raw content and an associated vector for embeddings. Since Weaviate supports both internal and external vectorization, this schema is configured to use an external embedding model.

```python
...

# Define the schema
class_name = "Document"
if not client.schema.exists(class_name):
    schema = {
        "classes": [
            {
                "class": class_name,
                "vectorizer": "none",  # Using an external embedding model
                "properties": [
                    {"name": "text", "dataType": ["text"]},  # Stores chunk text
                ]
            }
        ]
    }
    client.schema.create(schema)
```

Before adding documents to Weaviate, convert text into vector representations using an embedding model. We'll be using `all-MiniLM-L6-v2` from `sentence_transformers`.

:::tip
Using an external embedding model ensures flexibility in choosing the most suitable representation for your data, which can be important if your Weaviate engine is struggling to score well on metrics like `Contextual Precision`.
:::

```python
...

# Load an embedding model
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

# Example document chunks
document_chunks = [
    "Weaviate is a cloud-native vector database for scalable AI search.",
    "Weaviate enables fast semantic search across millions of vectors.",
    "It integrates with external embedding models for custom vectorization.",
    ...
]
# Store chunks with embeddings
with client.batch as batch:
    for i, chunk in enumerate(document_chunks):
        embedding = model.encode(chunk).tolist()  # Convert text to vector
        batch.add_data_object(
            {"text": chunk}, class_name=class_name, vector=embedding
        )
```

## Evaluating Weaviate Retrieval

Once the Weaviate retriever is set up, we can begin evaluating its effectiveness in returning relevant contexts. This involves:

- **Constructing a Test Case**: to do so, define an `input` query that represents a typical search scenario and prepare the expected output. Then generate the `actual_output` for the given input and extract the retrieved context during generation.
- **Evaluating the Test Case**: simply run deepeval's `evaluate` function on your populated test case and selection of retriever metrics.

### Preparing your Test Case

The first step in generating the `actual_output` from your RAG pipeline is retrieving the relevant `retrieval_context` from your Qdrant collection based on the input query. Below is a function that encodes the query, searches for the top 3 most relevant vectors in Qdrant, and extracts the corresponding text from the retrieved results.

```python
...

def search(query):
    query_embedding = model.encode(query).tolist()

    result = client.query.get("Document", ["text"]) \
        .with_near_vector({"vector": query_embedding}) \
        .with_limit(3) \
        .do()

    return [hit["text"] for hit in result["data"]["Get"]["Document"]] if result["data"]["Get"]["Document"] else None

query = "How does Weaviate work?"
retrieval_context = search(query)
```

Next, incorporate the retrieved context into your LLM's prompt template to generate a response.

```python
prompt = """
Answer the user question based on the supporting context.

User Question:
{input}

Supporting Context:
{retrieval_context}
"""

actual_output = generate(prompt)  # Replace with your LLM function
print(actual_output)
```

With both the `actual_output` and `retrieval_context` generated, we now have all the necessary parameters to construct our test case:

```python
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input=input,
    actual_output=actual_output,
    retrieval_context=retrieval_context,
    expected_output="Weaviate is a powerful vector database for AI applications, optimized for efficient semantic retrieval.",
)
```

Before proceeding with the evaluation, let's examine the generated `actual_output`.

```
Weaviate is a cloud-native vector database that enables fast semantic search using vector embeddings and hybrid retrieval.
```

### Running Evaluations

To evaluate an `LLMTestCase`, define the relevant retrieval metrics and pass them into the `evaluate` function along with the test case.

```python
from deepeval.metrics import (
    ContextualRecallMetric,
    ContextualPrecisionMetric,
    ContextualRelevancyMetric,
)
from deepeval import evaluate

...

contextual_recall = ContextualRecallMetric(),
contextual_precision = ContextualPrecisionMetric()
contextual_relevancy = ContextualRelevancyMetric()

evaluate(
    [test_case],
    metrics=[contextual_recall, contextual_precision, contextual_relevancy]
)
```

## Improving Weaviate Retrieval

Once you've evaluated your Weaviate retriever, it's time to analyze the results and fine-tune your retrieval pipeline. Below are example evaluation results from more test cases.

| Query                                       | Contextual Precision | Contextual Recall | Contextual Relevancy |
| ------------------------------------------- | -------------------- | ----------------- | -------------------- |
| "How does Weaviate store vector data?"      | 0.62                 | 0.95              | 0.50                 |
| "Explain Weaviate's indexing method."       | 0.55                 | 0.89              | 0.47                 |
| "What makes Weaviate efficient for search?" | 0.68                 | 0.91              | 0.53                 |

- **Contextual Precision is suboptimal** → Some retrieved contexts might be too generic or off-topic.
- **Contextual Recall is strong** → Weaviate is retrieving enough relevant documents.
- **Contextual Relevancy is inconsistent** → The quality of retrieved documents varies across queries.

:::info
Each metric is impacted by specific retrieval hyperparameters. To understand how these affect your results, refer to [this RAG evaluation guide](/guides/guides-rag-evaluation).
:::

### Improving Retrieval Quality

To enhance retrieval performance, experiment with the following Weaviate hyperparameters:

1. **Tuning `with_limit` (Top-K retrieval)**

   - If precision is low, reduce `with_limit` to retrieve fewer but more accurate results.
   - If recall is too high with irrelevant results, adjust `with_limit` to balance quantity and quality.

2. **Optimizing `vectorizer` (embedding model)**

   - Test alternative embedding models for better domain-specific retrieval:
     - `BAAI/bge-small-en` for ranking improvements.
     - `nomic-ai/nomic-embed-text-v1` for retrieving longer-form documents.
     - `msmarco-distilbert-base-v4` for passage retrieval.

3. **Implementing Hybrid Retrieval (Vector + BM25)**

   - If Weaviate’s pure vector search isn’t retrieving precise matches, combining vector search with BM25 keyword retrieval can help.

4. **Applying Advanced Filtering (`nearText`, `where` constraints)**
   - Leverage metadata-based filtering to refine search results and remove less relevant chunks.

### Experimenting With Different Configurations

To systematically test variations in retrieval settings, run multiple test cases and compare contextual metric scores.

```python
# Example of running multiple test cases with different retrieval settings
for vectorizer in ["all-MiniLM-L6-v2", "bge-small-en", "nomic-embed-text-v1"]:
    retrieval_context = search(query, vectorizer)

    test_case = LLMTestCase(
        input=query,
        actual_output=llm.generate(query, retrieval_context),
        retrieval_context=retrieval_context,
        expected_output="Weaviate is an optimized vector database for AI applications.",
    )

    evaluate([test_case], metrics=[contextual_recall, contextual_precision, contextual_relevancy])
```

### Tracking Improvements

After tuning your Weaviate retriever, monitor improvements in **Contextual Precision**, **Contextual Recall**, and **Contextual Relevancy** to determine the best hyperparameter combination.

:::tip
For structured tracking of retrieval performance and hyperparameter comparisons, [Confident AI](https://www.confident-ai.com/) provides real-time evaluation analysis.
:::
